Dataset used for the paper "Issues-Driven Features for Software Fault Prediction". The dataset contains 86 projects from the open source organizations Apache and Spring. All are written in Java and use the Git version control system and an issue tracking system (JIRA or Bugzilla). For each project, we extracted data for the software fault prediction (SFP) task as follows. First, we filtered out projects with no reported resolved bugs or with fewer than 5 released versions. Then we iterated over the resolved bugs and mapped them to the commits that fixed them. Next, for each version, we labeled the faulty files in the version. A faulty file is a file that was modified in a commit in the version that resolved a bug. ...
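The extraction steps in the description above (filter projects, map resolved bugs to fixing commits, label faulty files per version) can be sketched roughly as follows. This is a minimal illustration and not the authors' pipeline: the `Commit` record, the helper names, and the rule that links a commit to a bug by finding an issue key such as `DEMO-1` in the commit message are all assumptions, since the description does not say how the linking was done.

```python
import re
from collections import defaultdict
from dataclasses import dataclass

# Assumed linking rule: an issue key like "DEMO-1" appears in the commit message.
# The dataset description does not state the actual rule.
ISSUE_KEY = re.compile(r"\b[A-Z][A-Z0-9]+-\d+\b")


@dataclass(frozen=True)
class Commit:
    """Hypothetical commit record used only for this sketch."""
    sha: str
    version: str   # release the commit belongs to
    message: str
    files: tuple   # paths modified by the commit


def keep_project(resolved_bugs, versions, min_versions=5):
    """Filter step: keep projects with resolved bugs and at least 5 releases."""
    return len(resolved_bugs) > 0 and len(versions) >= min_versions


def map_bugs_to_commits(resolved_bugs, commits):
    """Map each resolved bug id to the commits whose message mentions it."""
    fixes = defaultdict(list)
    for commit in commits:
        for key in ISSUE_KEY.findall(commit.message):
            if key in resolved_bugs:
                fixes[key].append(commit)
    return dict(fixes)


def label_faulty_files(fixes):
    """Per version, collect files modified by at least one bug-fixing commit."""
    faulty = defaultdict(set)
    for fixing_commits in fixes.values():
        for commit in fixing_commits:
            faulty[commit.version].update(commit.files)
    return dict(faulty)


# Toy data, invented for illustration only.
commits = [
    Commit("a1", "1.0", "DEMO-1: fix NPE in parser", ("src/Parser.java",)),
    Commit("b2", "1.0", "Refactor logging", ("src/Log.java",)),
    Commit("c3", "1.1", "DEMO-2 handle empty input",
           ("src/Parser.java", "src/Reader.java")),
]
resolved = {"DEMO-1", "DEMO-2"}
faulty = label_faulty_files(map_bugs_to_commits(resolved, commits))
```

In this toy run, `src/Log.java` is never labeled faulty because its commit references no resolved bug, while `src/Parser.java` is faulty in both versions.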
A Public Unified Bug Dataset for Java and its Assessment Regarding Metrics and Bug Prediction. Onli...
Abstract: Software quality researchers build software quality models by recovering traceability link...
Mining software repositories is a growing research field where rich data available in the different ...
About the Data: They download Herzig et al.’s datasets, which included the identifiers of issue reports...
The number of research papers on defect prediction has sharply increased over the last decade or so. ...
This data set will be released as part of the following publication. "Root cause prediction based on...
One of the important aims of the continuous software development process is to localize and remove a...
Context: Defect prediction research is based on a small number of defect datasets and most are at cl...
An important goal during the cycle of software development is to find and fix existing defects as ea...
Context: The SZZ algorithm is the de facto standard for labeling bug fixing commits and finding indu...
The dataset was made by downloading the top 500 starred Java projects from GitHub and then eliminat...
Most software fault proneness prediction techniques utilize machine learning models which act as bla...
Two recent studies explicitly recommend labeling defective classes in releases using the affected ve...
Abstract: Detecting bugs as early as possible plays an important role in ensuring software quality b...
This dataset contains data for performing fault-proneness, defect prediction or any other kind of re...